K-2 Assessment Systems Enable
Early Intervention to Foster Student Success


Joanne L. Jensen, WestEd 


Policy Perspectives


Jessica Goldstein, Boulevard Research Partners LLC 


Matthew A. Brunetti, WestEd 


Across the United States, statewide assessments in English language arts 
and mathematics are federally mandated each school year in grades 3 
through 8 and once in high school. The intent is to help educators, policy- 
makers, and parents directly gauge how students and their school systems 
are performing against state standards. Missing nationwide, however, is any
systematic state-level attempt to evaluate students’ ongoing progress in
grades K-2, the grades that lay the foundation for all later learning.


As noted in the companion paper
in this series, as many as 41 states
(Center for Standards, Assessment, and 
Accountability, 2021; Weisenfeld et al., 
2020) assess incoming kindergarten- 
ers to understand how prepared each 
child is to participate in kindergarten 
curricula — a critical development, 
since research shows that, without 
effective intervention, performance 
gaps among students as they enter 
kindergarten persist into third grade 
(Duncan et al., 2007; Connor et al., 
2011; Neuman & Dickinson, 2001). 


However, for districts and schools
to effectively intervene to change
the trajectories for these young
students, ongoing evaluation of
students’ progress throughout kindergarten,
first, and second grades is
essential. And although many states
provide different kinds of early grade
assessments, no state currently has
a multidimensional statewide K-2
system. When and how to use available
K-2 assessments is largely left up to
local jurisdictions, with mixed results
in terms of teachers’ ability to effectively
identify and attend to students’
learning needs.


This paper focuses on this lag in K-2 
assessment systems and how states can 
act to address it. We first review a range 
of assessment types and their utility for 
supporting learning in the early years 
of schooling. We then discuss research 
findings on state K-2 assessment
policies that provide insights for other 
policy leaders to consider as they work 
to build K-2 assessment systems that 
effectively help districts and schools 
support academic success for their 
youngest students. 


ABOUT THIS BRIEF 


Many children start school 
needing extra support to thrive 
academically in grades K-2,
the foundation for success as 
they move up the grades. This 
paper discusses designing early 
grade assessment systems that 
enable educators to intervene 
throughout the K-2 years to
help students achieve success. A 
companion paper explains how 
states can lay the groundwork 
for addressing readiness gaps 
by identifying, at kindergarten 
entry, those children who may 
need extra support to thrive in 
the early grades. 

Types of Assessments and
Their Use for K-2


In a 2018 summary of recent state policies 
on early grade assessment, the Council of 
Chief State School Officers (CCSSO) delin- 
eated five types of early grade assessments: 
summative, interim, screener, diagnostic, and 
formative (2019). The paper reported that 35 
states offered one or more K-2 assessments,
with the majority requiring that all students 
be assessed. Some of these states, however, 
offer optional assessments or encourage 
districts to use some form of assessment to 
monitor student progress. Only two of the 35 
states provided summative K-2 assessments.


Twenty states administered assessments for 
diagnostic/screening purposes; four provided 
only formative/interim assessments; and 
seven administered assessments for both 
diagnostic and formative purposes. While all 
state K-2 assessments targeted reading and 
literacy skills, only 11 of the 35 also offered
a mathematics assessment. Six other states
offered an assessment that included math- 
ematics and an additional subject area. 


Table 1 below, drawn from information 
presented in the 2019 CCSSO publication, 
describes the purpose, frequency, question 
addressed, and system-level use for each 
type of assessment. 


Table 1. Assessment Types, Purposes, Frequency, Questions Addressed, and System-Level Uses

| Assessment Type | Purpose | Frequency | Questions Addressed | System-Level Use |
|---|---|---|---|---|
| Summative | Evaluates whether students have met grade-level standards | Once, typically toward the end of the school year | Have these students met grade-level standards? | State, District, School, Classroom |
| Interim | Evaluates whether students are advancing toward achievement of grade-level standards | At key points throughout the year | Are students on track to meet grade-level standards by the end of the year? | District, School, Classroom |
| Screener | Identifies those who may need extra support to attain desired learning outcomes | Typically, at the beginning of the school year or as needed | Do students require additional support or further evaluation? | School, Classroom |
| Diagnostic | Determines the eligibility of students for specialized programming or services | As needed, typically based on the results from other assessments | What are students’ strengths and areas of specific need? Can learning needs be diagnosed by additional focused assessment? | Student |
| Formative | Checks students’ understanding during the course of instruction to guide teaching and learning | Daily | Are students learning what was planned for them to learn? If not, how can understanding be improved to meet learning goals? | Classroom, Student |

Note. Although summative results are not typically available until after the conclusion of the school year, the results can inform
classroom teaching practices for the subsequent school year.

More specifically, each of these assessment
types has strengths and limitations, as follows:


Interim and summative assessments. Interim
and summative assessments are designed
to periodically evaluate students’ progress
toward, and achievement of, grade-level
learning standards. Interim assessments are
typically administered at key points throughout
the year, while summative assessments
are typically administered once toward the
end of the school year. The uses of data from
these assessments overlap considerably, as
outlined in Table 2.


There are two distinct uses for interim assess- 
ments: to predict performance on summative 
assessments and to evaluate student learning 
and progress toward end-of-year goals. 


Predictive interim assessments may include
any grade-level learning standards. While
they can help school- and district-level
staff to identify students who are not
on track to meet learning expectations
required for promotion or meet the performance
expectation associated with
the summative assessment, they can also
prompt teachers to identify curricula and
approaches that may not be supportive of
learning for all students. There is a risk that
students may be assessed on content they
have not yet had the opportunity to learn,
whether through direct instruction or their
own discovery. Consequently, a low score
may reflect the pacing of instruction or
lack of opportunity to learn rather than
actual student learning.


Table 2. Uses of Data From Interim and Summative Assessments

| | Interim Assessment | Summative Assessment | Uses of Data |
|---|---|---|---|
| Accountability | How well is the system currently serving students? | How well did the system serve students? | Undertake program improvement, as needed; implement professional learning, as needed; make policy and funding decisions |
| Progress and Preparation | How well are students learning this year’s grade-level standards? | Did students achieve the grade-level expectations? What are students’ relative strengths and weaknesses based on reported subdomains? | Triangulate data with other sources; identify students’ strengths and learning needs; target instruction; communicate with families about students’ learning progress; review curricula and pacing; identify needed professional learning |
| Promotion | Might grade retention be a likely possibility for any of these students? | Are these students adequately prepared for the next grade? | Plan for and make decisions about grade retention |

Source. Authors.

To avert this pitfall and more definitively
evaluate progress toward mastery of
standards, states may require districts to
systematize the delivery of instruction by
following a pacing guide. A pacing guide
ensures consistency in the order in which
instructional topics are taught. When a
pacing guide is used, interim assessments
can be designed to assess only recently
instructed content. The assessments thus
evaluate whether a student has met the
specified portion of grade-level standards,
and those particular standards can be
evaluated with greater depth than can a
predictive approach.


For interim assessments to be useful to educators
and policymakers, assessment validity
is fundamentally required: the reason for
using the assessment must align with the
purpose for which the assessment
was designed (American Educational Research
Association, American Psychological Association,
& National Council on Measurement in
Education, 2014; Kane, 2006). For instructional
usefulness, timely reporting of results
is also critical, enabling teachers to adjust
strategies such as grouping or targeted
instruction to meet individual student needs.
Timeliness of reporting is less essential for
program improvement and policy planning.


Screeners and diagnostic assessments. In
a school setting, screening assessments,
commonly referred to as screeners, are
brief, typically low-cost assessments given
to all students to provide a basic understanding
of students’ discrete skills and
identify students who may be in need of
further evaluation. Screeners differ from
interim assessments in that they are brief
measures of targeted skills (e.g., phonological
awareness and oral reading fluency).
They differ from formative assessment in
that they are not context dependent (i.e.,
the same screener is administered to all
students), and they are used for identification
rather than checking for understanding.
Literacy and numeracy screeners are usually 
administered at the start of the year, and 
districts may also choose to screen students 
multiple times throughout the year. This 
flexibility is especially relevant in the K-2 
context because learning expectations 
evolve quickly over the course of the year. 
Schools may also use behavioral screeners 
to identify students who might benefit from 
additional social or emotional supports in 
the classroom. 


Importantly, screeners used for academic 
purposes need to be developmentally 
appropriate. For example, kindergarten 
screeners would target critical early literacy 
skills such as letter identification and pho- 
nological awareness, whereas screening for 
reading comprehension is more appropriate 
for students in second grade. In first grade, 
oral reading fluency may be more appropri- 
ate toward the end of the school year than 
at the beginning. 


Diagnostic assessments are used as a fol- 
low-up for individual students for whom the 
screening process identified the need for 
additional support. Diagnostic assessments 
can confirm or add detail to initial screening 
results and provide additional information 
related to eligibility for specialized program- 
ming or services. 


Recently, online curriculum providers — 
whose services include providing screening, 
diagnostic, and interim assessments — have 
prompted expanded use of diagnostic assess- 
ments. Rather than using such assessments 
only for students whose screening identified 
the need for greater support, an increasingly 
common approach is to administer diagnostic 
assessments to all students in order to inform 
decisions about the most appropriate content 
for each student, whether that student is 
below, at, or above grade level. 

Formative assessment. Formative assessment
is a process, a planned and ongoing
exchange between teacher and student in 
which each looks for real-time evidence of 
how learning is progressing that informs the 
need for adjustments to teaching and learn- 
ing. In this process, teachers and students 
alike are learners. Teachers use what they 
learn about individual and collective student 
progress to guide their instruction. Students 
use what they learn about their own prog- 
ress to guide their ongoing learning efforts 
(CCSSO, 2018). 


Effective formative assessment is intentional 
and continuous. The process hinges on
the teacher having clear learning goals and
communicating those to students, along 
with clear criteria for knowing when students 
have met those goals. Everyone, in short, has 
a clear picture of the various ways in which 
students might demonstrate mastery. Teach- 
ers must also be prepared to respond when 
students are not progressing as expected or 
hoped. Teachers’ use of formative assess- 
ment will necessarily vary, depending on the 
individual student or students. The process 
also includes student agency — that is, a 
learner's willingness and ability to engage
in self-assessment and to both give and be
open to accepting peer feedback. Given the 
complexities of formative assessment, class- 
room educators and administrators will likely 
profit from ongoing professional learning 
opportunities on how to use it effectively to 
improve student outcomes. 


Although the formative assessment process 
is nearly universally viewed as a core com- 
ponent of the learning process (Andrade & 
Cizek, 2010; Popham, 2013), little quantita- 
tive research has been done on its efficacy, 
particularly in terms of studies specific to 
children in the earliest grades (Turner & 
Coburn, 2012; Riley-Ayers, 2014). However, 
one recent review of research on the effects
of formative assessment in grades K-5
provides some preliminary guidance on 
where and how to use it most effectively. 
Researchers evaluated 23 studies, published 
between 1988 and 2014, that met the crite- 
ria for What Works Clearinghouse evidence 
standards and procedures (Klute et al., 
2017). Across the reviewed studies, students 
who participated in formative assessment 
performed better on measures of academic 
achievement than those who did not. 
Formative assessment had larger positive 
effects when used during mathematics 
instruction as compared to reading and 
writing instruction. There was also evidence 
that student-directed formative assess- 
ment, including self- and peer assessment, 
was effective specifically for mathematics 
instruction, while educator- or computer- 
program-directed approaches were effec- 
tive for both mathematics and reading. 


Considerations for Building a 
K-2 Assessment System


How can states build K-2 assessment 
systems that are coherent and effective? 
The starting point is purpose. Educators and 
policymakers need to ask themselves: What 
do we want to learn about our students and 
for what reason? What question(s) do we 
want answered and to what end? Within the 
overall purpose of narrowing the readiness 
gap and improving children’s early grade 
achievement, specific purposes may include 
summarizing learning from a school year to 
support instruction for individual students, 
planning classroom curriculum, providing 
population-level data intended to improve 
program quality, or guiding resource alloca- 
tions. To serve this range of purposes, exist- 
ing early-grade assessment systems include 
multiple types of assessments (Goldstein & 
Flake, 2016). 

As noted earlier, the use of assessment results
must be aligned to the assessment's intended 
purpose. That match is essential for assess- 
ment validity — the foundation of an effec- 
tive assessment system. In addition, states 
fundamentally need to address where, when, 
and how data resulting from assessments will 
be used to improve learning for K-2 students
who are in need of greater support. 


As policy leaders strive to create an effective 
K-2 assessment system, state experiences to
date offer considerations that can help inform 
their efforts. These include the following: 


Summative assessments provide a 
common statewide performance metric 
but alone are insufficient. A common 
end-of-year metric that reflects the depth 
and breadth of students’ content learning 
aligned to the state standards is important 
for measuring achievement of those stan- 
dards. When adapted for the early grades 
(K-2), summative assessments can inform
policy and program decisions and identify 
opportunities for professional learning by 
showing how well student performance 
aligns to grade-level expectations. 


Systematic monitoring of student learning 
based on a common metric is especially 
critical in states with laws that require
the retention of students who were not
proficient in reading by third grade. As of 
2018, 16 states were developing or had 
instituted such laws (Weyer, 2018). These 
statutes show a focus on literacy over other 
aspects of learning and development. They 
also point to a growing need for state K-2 
assessment systems that include identification
of students who face the likelihood
of retention, along with an evaluation of
their strengths and weaknesses, so that 
educators can tailor instructional support 
to specific learning needs. While summa- 
tive assessments can identify districts and 
schools that serve students in need of 
greater support to succeed, they do not
provide enough data on individual students 
to guide student-specific action. Moreover, 
summative assessment results typically 
arrive after students have moved on from 
the assessed grade. 


Of the 35 states in the CCSSO report that 
offered some type of statewide assessment 
in grades K-2, only four — Georgia, Indiana, 
Michigan, and Tennessee — included end- 
of-year assessments (CCSSO, 2019). In 
Georgia, kindergarten teachers complete
a year-long performance-based assessment,
GKIDS, designed to provide ongoing
diagnostic information about kindergarten 
students’ developing skills in English lan- 
guage arts, mathematics, science, social 
studies, personal/social development, and 
approaches to learning. GKIDS is charac- 
terized both as a summative assessment, 
since it provides a summary of student 
performance in English language arts and 
mathematics at the end of the kindergarten 
school year, and also as a formative assess- 
ment, since it supports kindergarten teach- 
ers to plan instruction throughout the year 
(Georgia Department of Education, 2018). 


Tennessee provides an optional summative 
assessment of early literacy and mathematics 
skills for second-grade students, designed 
to inform both second- and third-grade 
teachers of students’ mastery of the standards
and to support schools and districts
as they measure their progress toward the
goal of having 75 percent of third graders 
reading on grade level by 2025 (Tennessee 
Department of Education, 2017). Indiana and 
Michigan both had summative assessments 
in the primary grades as of the CCSSO's 
2019 reporting, but early grade summative 
assessments are no longer part of those 
states’ assessment systems. In Indiana, the 
Indiana Reading Evaluation and Determina- 
tion (IREAD-3) is a summative assessment 
administered to third graders that provides a
measure of mastery of foundational reading 
standards through grade 3 (Indiana Depart- 
ment of Education, 2019). In Michigan, there 
are benchmark (interim) assessments of early 
literacy and mathematics skills administered 
in the fall, winter, and spring in kindergar- 
ten and first and second grade. The state 
provides the assessments for this use, but 
districts also have the option to select their 
own assessments (Michigan Department of 
Education, n.d.-a; Michigan Department of 
Education, 2019). 


Interim assessments may have greater 
utility for tracking the development of 
early reading and mathematics skills. In 
the absence of state summative assess- 
ments, interim assessments can support 
district and school staff in assessing student 
learning. Interim assessments may, in fact, 
be more useful to educators because they 
are administered more frequently and typi- 
cally structured with more targeted content 
than summative assessments. States can 
mandate a specific commercially or locally 
developed assessment or offer districts a 
list of state-approved assessments, as is 
done in Louisiana and Michigan (Louisiana 
Department of Education, n.d.-a; Michigan 
Department of Education, n.d.-b). 


State-approved lists have the advantage
of allowing districts to align assessments
to their students’ needs. Different assess- 
ments, however, may measure slightly 
different constructs — for example, the 
conceptualization of early reading skills may 
vary somewhat across assessments. States 
that permit districts to provide their own 
interim solution should require evidence 
that the chosen assessments are aligned to 
state standards and measure the depth and 
breadth of the assessable state standards. 


CCSSO reported that approximately 70 
percent of the 26 states that offered an
interim assessment required districts to
report results to the state (CCSSO, 2019). 
But comparing those results statewide can 
be a challenge if district-selected assess- 
ments lack a common metric for “profi- 
ciency” or “on track to proficiency.” One 
solution is for states to institute a standard- 
setting procedure that supports a common 
target for the identification of students who 
are on track for achieving proficiency and 
those who are not. 


An alternative to interim assessments for 
gauging early reading and mathematics 
skills is the use of screeners. A growing 
number of states are integrating reading 
screeners into state systems, a shift that 
coincides with a rising number of third- 
grade reading retention laws across the 
country (CCSSO, 2019). One advantage of 
screeners is that they measure targeted 
skills that are demonstrably associated with 
the learning outcome of interest. Because 
screener assessments are targeted, they 
tend to be less time-consuming than 
interim or summative assessments. 


A system of screeners at key points between 
kindergarten and second grade can support 
districts in identifying students who, without 
added support, face the likelihood of retention
in third grade. Further, states should
consider using mathematics as well as 
reading screeners in the early grades, since 
early mathematics skills have a demonstrated 
association with academic success in later 
elementary grades (Watts et al., 2014). 


Including support for formative assessment 
in a state system helps enable real-time 
classroom intervention and remediation. 
Assessments that involve a time lag of 
weeks or months before results are avail- 
able have limited instructional usefulness, 
since remediation requires that teachers 
return to content taught long before. In 
contrast, the formative assessment process 
allows teachers to adjust instruction in real
time, based on student progress (CCSSO,
2018). In recognition of the importance
of formative assessment in state-level
assessment systems, the Smarter Balanced 
Assessment Consortium provides member 
states with activities and lessons to support 
districts in using the formative assessment 
process (Smarter Balanced, 2020). Similarly, 
Tennessee is expanding its Tennessee 
Comprehensive Assessment Program to 
include state-provided summative, interim, 
and formative components (Tennessee 
Department of Education, 2020). (For the 
purposes of its assessment system, Tennes- 
see defines formative assessments as short 
assessments that cover a limited number of 
the Tennessee Academic Standards in each 
assessment, which can be administered by a 
teacher as desired.) 


Conclusion 


Robust assessment systems in K-2 can 
provide data to evaluate school and dis- 
trict performance, gauge young students’ 
progress against state standards, and help 
educators identify and intervene early with 
students who may need added support
to achieve desired learning outcomes.
Model state assessment systems for the 
early grades include summative and interim 
assessments as well as screeners and 
formative assessment. Summative assess- 
ments offer schools and districts an annual 
evaluation of overall student performance 
and can serve as an indicator of the need to 
focus supports on individual students who, 
without intervention, would appear to be on 
track for poor performance in the upcom- 
ing year. Interim assessments are more 
instructionally useful since they are frequent 
and more targeted. Screeners, similarly, can 
help teachers identify students who need 
specific kinds of academic support. The
utility of these tools is enriched throughout 
the year with formative assessment. The 
key is designing assessment systems that 
include multiple kinds of assessments that 
balance out the strengths and limitations of 
each type in the service of helping the state 
improve learning for all K-2 students. 